A Voice Dictation System for a Million-Word Czech Vocabulary

Authors

  • Jan Nouza
  • Tomáš Nouza
Abstract

The paper describes a set of techniques developed for discrete dictation with a vocabulary of up to a million entries; a vocabulary of this size is one of the main challenges in highly inflected languages like Czech. We present our approach to building an efficiently coded tree lexicon with suffix sub-trees and morphological classification. Acoustic modeling is based on monophone, diphone, or triphone models. Lexical and grammatical constraints are represented by unigrams and bigrams, where the latter take the form of a binary matrix describing grammatical admissibility between word categories. The complete system has been evaluated on a 792,338-word lexicon. Its real-time implementation yields a word error rate below 12%.
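
The binary bigram matrix mentioned in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the category names, the matrix values, the tiny word list, and the helper functions `is_admissible` and `filter_candidates` are hypothetical and are not taken from the paper.

```python
from typing import Dict, List

# Hypothetical word categories; the paper's actual morphological classes
# are not listed in the abstract.
CATEGORIES: List[str] = ["ADJ_FEM_SG", "NOUN_FEM_SG", "NOUN_MASC_SG", "VERB_3SG"]
INDEX: Dict[str, int] = {name: i for i, name in enumerate(CATEGORIES)}

# ADMISSIBLE[i][j] == 1 means a word of category i may be followed by a
# word of category j. The values are illustrative only.
ADMISSIBLE: List[List[int]] = [
    [1, 1, 0, 0],  # ADJ_FEM_SG  -> another feminine adjective or a feminine noun
    [0, 0, 0, 1],  # NOUN_FEM_SG -> verb
    [0, 0, 0, 1],  # NOUN_MASC_SG -> verb
    [1, 1, 1, 0],  # VERB_3SG    -> adjective or noun
]

# Hypothetical lexicon fragment mapping each word to one category.
WORD_CATEGORY: Dict[str, str] = {
    "malá": "ADJ_FEM_SG",
    "kniha": "NOUN_FEM_SG",
    "stůl": "NOUN_MASC_SG",
    "leží": "VERB_3SG",
}


def is_admissible(prev_word: str, next_word: str) -> bool:
    """Look up the category pair of two words in the binary matrix."""
    i = INDEX[WORD_CATEGORY[prev_word]]
    j = INDEX[WORD_CATEGORY[next_word]]
    return ADMISSIBLE[i][j] == 1


def filter_candidates(prev_word: str, candidates: List[str]) -> List[str]:
    """Keep only candidates that may grammatically follow prev_word."""
    return [w for w in candidates if is_admissible(prev_word, w)]
```

Because the matrix is indexed by category rather than by word, its size depends on the number of categories and not on the number of lexicon entries, which is one plausible reason such a representation suits a very large vocabulary.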

Related resources

Very large vocabulary voice dictation for mobile devices

This paper deals with optimization techniques that can make very large vocabulary voice dictation applications deployable on recent mobile devices. We focus in particular on optimization of signal parameterization (frame rate, FFT calculation, fixed-point representation) and on efficient pruning techniques employed on the state and Gaussian mixture level. We demonstrate the applicability of the propose...

Study on Cross-Lingual Adaptation of a Czech LVCSR System towards Slovak

This paper deals with cross-lingual adaptation of a Large Vocabulary Continuous Speech Recognition (LVCSR) system between two similar Slavic languages – from Czech to Slovak. The proposed adaptation scheme is performed in two consecutive phases and it is focused on acoustic modeling and phoneme and pronunciation mapping. It also utilizes language similarities between the source and the target l...

MAP Based Speaker Adaptation in Very Large Vocabulary Speech Recognition of Czech

The paper deals with the problem of efficient adaptation of speech recognition systems to individual users. The goal is to achieve better performance in specific applications where one known speaker is expected. In our approach we adopt the MAP (Maximum A Posteriori) method for this purpose. The MAP based formulae for the adaptation of the HMM (Hidden Markov Model) parameters are described. Sev...

Design and development of voice controlled aids for motor-handicapped persons

In this paper we present two voice-operated systems that have been designed for Czech motor-handicapped people to allow them full access to computers and computer based services. The programs, which are named MyVoice and MyDictate, are complementary in their functions. Both employ ASR engines developed in our lab. The former is used primarily as a midsize-vocabulary (up to 10K words) voice comm...

EasyCmd: Navigation by Voice Commands

In this paper we present a system named EasyCmd that provides voice navigation on the desktop of the Microsoft Windows 9x system. The speech recognition engine for EasyCmd is very similar to that of a dictation machine. Statistical Knowledge Based Frame Synchronous Search algorithm (SKBFSS) and Word Search Tree (WST) technologies are applied for acoustic decoding. Recognition Score Gap (RSG) is used for ...


Publication date: 2004